"La force du tonnerre 3"


Author : the beauty spot ; oostyx (at) gmail (dot) com (serious projects/questions on TF3/General RH : stupid question => no reply)
Version : December 2010 (2nd version)
Tools used : IDA Disassembler, mida, Regen, GensKMod, calc.exe, Hex Workshop, Linux env / BASH, a brain
Documents used : VDP Basics, Sega Genesis Hardware, Motorola 68000 Instructions Set, Sonic Retro website


    This is a document where I give a lot of informations about Thunder Force 3 structure, so some future interested people could edit it easily. I am really sorry for my english, I'm european, and I think a more serious english translation could be a good idea. Thunder Force 3 is for me the best shmup of the Genesis, and I never saw any hack of shmups, so I thought it was a good idea to make this game be edited, now it's possible. If you don't understand some technical informations, you should go to the best RH on Genesis website : Sonic Retro. Here is another website containing good informations, try to save a lot of tools, because a day, these websites won't exist anymore. Another good website about rom hacking : enhaclopedia.

All informations of this document concern the current ROM : "Thunder Force 3 (JU) [!]" in a bin format..



Table of contents

Lexical introduction
Change Log

Objects
    Objects properties
    Objects categories
    Objects programs
    Objects memory slots
    Objects statics properties
Sounds

    Sound driver
    Voices
    Musics
    Small sounds
Graphics

    Palettes
    General Art
    Level art
    Boss art
    Target shadows
    Story/credits art
Iwanaga Compression

    Compression technique
    Uncompression : The Iwanaga Tree
    Uncompression : Huffman Part
    Uncompression : LZ Part
    Compression Tools
Ohtsuka Compression
Level
    Objects lists
    Level parameters
    Level playlists
    Level information
Other data

    Demo
    Menus
    Credits/End
Conclusion




Lexical Introduction

0xAABBCCDD is an hex number (base 16)
AABBh is an hex number
1234 is a decimal number

Every number which contains a letter is an hex number, this document uses only hex notation and decimal notation.

68000 / M68K : main processor of the SEGA Genesis
tile : square of 8x8 pixels, all graphical entities are made of tiles
mapping/"maps" : organisation of tiles (on a sprite = sprite mappings, on a level : plane mappings)
plane : it is a layer used to display the level tiles
chorus : special sound used to play kind of "voices" : "CLAW !" for example
object : game entity, displayed with a sprite, which interacts with the player
main actor/main ship/styx : the styx ship, controlled by the player, the main actor is a specific game object
program/code : set of Motorola 68000 instructions, contained in the game, to execute a specific action
subroutine : functions/procedure/block of Motorola 68000 instructions usable/callable anywhere in the game's code
RAM : memory of the M68K, from 0x00FF0000 to 0x00FFFFFF, when I give a ram adress equals to 0xFE00, its real adress is 0x00FFFE00 (similar for the Genesis to 0xFFFFFE00), when an adress is given without a "RAM specification", it's a standard ROM adress (offset in the ROM file)

When I give a name for a subroutine, it's not the original subroutine name, used by developpers, but just a name I give to the subroutine, according to its content/function, to have a better description
of the code blocks instead of having an adress.


Change Log

December 2010 :  Added ChangeLog, Ohtsuka compression, general art / bosses / target shadows modifications about format, subroutine names updated.


Objects

Object are entities we considerate as sprites, they interacts with the main ship, it can be fireshots, lasers, enemies, bonus, etc...
Editing these entities is interesting, and I found a lot of informations to edit them, especially the static properties which have the same structure for each standard object.

Object properties

Object are entities loaded during a level, every ennemies are objects, bosses too, bonus items too...
An object has the several properties :
    - program = used to execute the behavior of the object/to update dynamic data of the object
    - graphics = used to display the object, it's always some sprites, added together to generate a big objects, or sprite displayed one after another to make an animation
    - memory = used to save dynamic data of the object

Objects categories

I found that dynamic data of objects are stored in three specific RAM segments on the game.

SEGMENT START (RAM)SEGMENT END (RAM)SLOT SIZE
Main ship fires0xAA000xADFF0x40
Energy ball0xAE000xB0FF0x30
Standard objects0xB1000xBFFF0x60

This RAM segments are used during a level to store dynamic data of objects. Main ship fires & energy ball are special objects, and are note considerate as standards objects, they are
stored in two different segments, separated from standard objects. Each segment is an array of slots, when an object is loaded in RAM, a slot is filled into the dedicated segment. Main ship fires and energy balls are very small objects so their slots are smaller. There is another specific object : the styx ship.

In the rest of the document, you must consider an object as a standard object (not a styx, a fireball, or a main ship fire).

Objects programs

Each object has its own dedicated subprogram, you must edit it to modify the complete behavior of an object. To have a good mapping between objects and programs, each object has its own ID number, the first object is the 0x07 (bonus ship), and the last is the 0x9F (boss explosion).

Here is a table program location for each object of the game : Objects Program Locations.

The main routine which execute object programs, is called Object, and located at 0x81A4, the offset table of object is located just after this subroutine.

At each new picture calculation, objects are up to date, a dedicated subroutine reads all entries of the object segment, if a slot is containing some dynamic data, it means an object is in here, and must be up to date, the ID is read, and with this ID, the main code knows where the corresponding subprograms of the object is located, and make a branchement on it.

Lot of objprograms are organized in many routines, a "main code" is executed, and after a routine number is read in the object slot, so another part of the program is executed, it's done by using the subroutine that I called ObjBranchToRoutine.

ROUTINE BRANCHEMENT EXAMPLE ON THE OBJ 0F
ROM:00008A1C Obj0F:                                  ; DATA XREF: ROM:00008210t
ROM:00008A1C                 move.w  #$4270,d0
ROM:00008A20                 btst    #1,$11(a6)
ROM:00008A26                 beq.s   Obj0F_8A2C
ROM:00008A28                 move.w  #$4288,d0
ROM:00008A2C
ROM:00008A2C Obj0F_8A2C:                             ; CODE XREF: ROM:00008A26j
ROM:00008A2C                 move.w  d0,$24(a6)
ROM:00008A30                 movea.l #Obj0F_Routines,a0
ROM:00008A36                 bra.w   ObjBranchToRoutine
ROM:00008A36 ; ---------------------------------------------------------------------------
ROM:00008A3A Obj0F_Routines: dc.w Obj0F_r00-jObj     ; DATA XREF: ROM:00008A30t
ROM:00008A3A                                         ; ROM:Obj0F_Routinest ...
ROM:00008A3C                 dc.w Obj0F_r01-jObj
ROM:00008A3E                 dc.w Obj0F_r02-jObj

Here is a small explanation on how a simple object program has to be read, and analyzed : Object1F Program.

Objects memory slots : structure and accessing

Standard objects have slots to save dynamic data, and some informations are located at the same place for most objects. Here is a description of slots offsets, which identifies some variables in an object memory slot, programs modifies these slots by using an offset on an adress register, #$26(a6) will access the X coordinate of the object on the screen, the IDA disassembly that I've set up must
be better by using enums, and renaming all offsets used on objects, so you can see what real variable is accessed instead of a random number !

The a6 register always contains the beginning adress of the slot of the currently executed object.

Go to Obj1F Program, to have information about memory slot variables commonly used.

Objects static properties

A lot of objects are enemies or lasers. To decrease the size of the final game, developers chose to set some static properties for each object.
There is a big table in the game, which contains static properties of each object : the first table entry gives the 07's properties, the second table entry gives the 08's properties and so on.

This table can be found at 0x7814 on the ROM !

Here is the table, which contains object descriptions (not completed), program adresses are wrong on this file, use the previous document : Objects Static Properties/Description.

Object static properties format

byte 0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

Bitfield

Health Points

Score earned

UNK : specific to object ? Parameters for object subroutine ?


Here is the specific Bitfield format :

bit 15

14

13

12

UNK : Always 0 ??


bit 11

10

9

8

UNK

UNK

Invincible if set

Reaction with Styx's fire if set


bit 7

6

5

4

UNK

THEORY : object pasted to the screen ?
(follow the scroll)

Specific subroutine executed for the object if set. The beginning of the subroutine is in ROM at ADDRESS :(0x81F2+(OBJECT_IDx2))

Multiple explosion ?


bit 3

2

1

0

Targeted by the Styx's 5th weapon if clear

Explosion style ?

UNK

UNK


Here is an example of modifying these properties, this is really powerful.

Objects art

Object art location depends on level objects are executed. There is an object art bank for each level, and a final one for general purposes (bonus ships/items/styx).
An object art bank is a bank of sprite, and contains both tiles and sprites mappings, it's compressed, see Graphics for more information about these banks. I never find what data make the link between a sprite contained in a bank and the sprite displayed for an object, it can be an offset loaded in a variable of the object slot, or a sprite number, fixed by the object program, studying the object slot's variables is required to understand the real link.

 

Sounds

The game contains sound data, this data is separated in three formats : sounds, chorus and musics. All of these sounds seems to be executed by the Z80. The Z80 is another processor of the SEGA Genesis, it must run a program (a driver) to communicate with the M68K.

Sound driver

I didn't make good researchs on it, it's possible that it's located at the adress 0x16680, it must be disassembled separately, as a Z80 program to be understood.
The subroutine that seems to load the driver is the one I called SoundDriverLoad, located at 0x51B0.
The driver seems to be 0x197F long (as it's specified in the loading program => dbf operand).

Chorus/Voices

Chorus locations are described in an offset table at 0x73478 (VoicesPtrTable), it describes locations for 7 chorus, and the 7th one seems to be unused on the game.
Voices are executed as requests, like musics I think. A bit of code require a voice play by calling the subroutine PlayVoice (0x54DC), and then the voice requested is really launched
by another piece of code that I named ExecuteVoiceRequest (0x54E2).

Voices related locations
 0000:0000411A Abs   PlayClawVoice (subroutine to request the specific 'CLAW' voice)
 0000:000054DC Abs   PlayVoice (subroutine to request a voice)
 0000:000054E2 Abs   ExecuteVoiceRequest (subroutine which executes the voice requested)

 0000:00073478 Abs   VoicesPtrTable (table of offsets which described voices bank organisation)

 0000:000734C8 Abs   Voice0 (raw data)
 0000:00073D60 Abs   Voice1
 0000:00074798 Abs   Voice2
 0000:00075118 Abs   Voice3
 0000:00075AF0 Abs   Voice4
 0000:00076350 Abs   Voice5
 0000:00076A00 Abs   Voice6
 0000:00077150 Abs   Voice7_Bad
(raw data unused ?)

I have no information about the format, I think this data is not compressed.

Musics

There are two banks of music in the game, the first one contains all the lasts musics (as they are numbered in the sound test), and the second one contains all the firsts musics.
Each bank is introduced by a MusicPtrTable, the first is at

MUSIC RELATED LOCATIONS
 // SUBROUTINES
 0000:0000107A Abs   PlayBossMusic

 0000:00005216 Abs   PrepareMusic1

 0000:00005240 Abs   PrepareMusic2
 0000:00005284 Abs   PlayMusicExec
 0000:000052A0 Abs   PlayMusic

   // DATA ????
 0000:00006B18 Abs   MusicTestData1
 0000:00006B32 Abs   MusicTestData2
 // BANK 1
 0000:00070000 Abs   MusicPtrTable1

 
 0000:00070020 Abs   Music12
 (ellis boss) 
 0000:0007047F Abs   Music13 (orn base boss)
 0000:00070795 Abs   Music14 (final boss)
 0000:00070EA8 Abs   Music15 (main theme)
 0000:000714B2 Abs   Music16 (stage select)
 0000:00071A1A Abs   Music17 (level end)
 0000:00071BB3 Abs   Music18 (continue)
 0000:00072036 Abs   Music19 (game over)
 0000:000722F3 Abs   Music20 (end story)
 0000:000726EB Abs   Music21 (staff)
 // BANK 2
 0000:00078000 Abs   MusicPtrTable2


 0000:00078020 Abs   Music0 (hydra)
 0000:00078D6F Abs   Music1 (gorgon)
 0000:00079953 Abs   Music2 (seiren)
 0000:0007A6B5 Abs   Music3 (haides)
 0000:0007B088 Abs   Music4 (ellis)
 0000:0007B981 Abs   Music5 (cerberus)
 0000:0007C7DE Abs   Music6 (orn base)
 0000:0007D462 Abs   Music7 (orn core)
 0000:0007DA3E Abs   Music8 (hydra boss)
 0000:0007E144 Abs   Music9 (gorgon boss)
 0000:0007E53B Abs   Music10 (seiren boss)
 0000:0007E9E4 Abs   Music11 (haides boss)

MUSIC data seems to be in the SMPS 68K  format, but I'm not enough strong on SEGA GENESIS music formats to be totally sure of that. Please go see this to understand better : http://info.sonicretro.org/SCHG:Music_Hacking. If it"s that, I think musics of Thunder Force 3 can be included in some Sonic games, TF3 musics are not compressed, no decompression routine is executed by the m68k...

I think the music playing is done by the same system of request/execution used to play chorus, PlayMusic subroutine is used to request a music to be played, and PlayMusicExec really begins the music lastly requested.

Small sounds

Other data seems to be sounds played which are not voices or musics, I don't study this, but I have some locations & subroutines :

SOUND RELATED LOCATIONS
 // SOUND RELATED SUBROUTINES
 0000:0000532C Abs   PlaySound1 (with priority management)
 0000:0000534E Abs   PlaySound2 (without priority management)
 0000:00005368 Abs   ExecuteSoundRequest
 0000:0000FCAE Abs   ObjPlaySound (similar to PlaySound1 but always called by objects programs)

   // SOUND DATA
 0000:00077300 Abs   SoundBank_Header
 0000:00077308 Abs   SoundBank_PtrTable
 0000:0007738A Abs   SoundBank_Data

 0000:0007F300 Abs   Sound_Unk

PlaySound1 & PlaySound2 write a requested sound in a memory case, but the first version seems to not write the sound requested if another case is filled with a certain value, maybe it's
a kind of priority management between sounds (because someones are more important, like the EMERGENCY sound).

SoundBank_PtrTable seems to be read and interpreted by the Z80 because the table contains 16-bits offsets but in Little Endian, each Little Endian offset gives the location of a small sound of the game, (they have a very small size), so there is 65 offsets in this table, giving a location in SoundBank_Data. SoundBank_Data contains raw sounds, as a list, I didn't generate a data block for each sound because I'm lazy. SoundBank_Header contains 4 words I don't understand, probably used by the Z80.

All data related to small sounds like that are never used by the 68000 code.

Sound_Unk contains data related to general sounds, maybe it is used by the Z80 as some parameters ? The 68000 code does not use this data and I don't know how it can be used, it seems corruption does not change anything... Beta/debugging features ???


Graphics


Because the program being reversed is a video game, it contains a lot of graphical data, we must introduce three format of data :
    - palettes : a palette is an array of 16 numbers of 16-bits, it is used to apply a color to graphics, the SEGA Genesis can use 4 palettes at the same time
    - tilesets : it's a set of different tiles, used to display something as an organisation of tiles
    - mappings : it's the organization of tiles described to display something correctly , it can be a map (a level layer), or a sprite, or a sentence for exemple
    - sprites : it's a graphical unit composed of a mapping of tiles, which can have a lot of different sizes

All of this data is loaded in the SEGA Genesis VDP memory : the VDP is a chip dedicated to graphical purposes. It seems that all data which is not palette data is compressed.

Palettes

The easiest thing to edit is the colors, so the palettes here is palettes I find in this game, I think I didn't find all of them, because this data is very fragmented and difficult to find... But you have
enough here to modifiy main colors of the game.

Palettes locations Warning ! Boss palettes are in an only one data block (16 bytes-jump for next boss palette, beginning by hydra boss I suppose).

There are 4 palettes per level :
    - one palette for background
    - one palette for foreground
    - one palette for level objects
    - one palette for general purposes : main ship, fires, shots, energy balls...

While the boss is loading, objects palettes are replaced by boss palettes.

General art

General art includes all art used to display objects or graphics used on all the game (not level specific or boss art). It includes art of the styx ship, fireshots, energy balls,
planets, digits, letters... I have find a bank of art which contains all the general purposes graphics... It includes all format of art data which is not palette. This data is compressed using Ohtsuka compression.

GENERAL ART RELATED LOCATIONS
 0000:0005A080 Abs   GeneralArtPtrTable

 0000:0005A0A0 Abs   GeneralArt0 (styx, fires and hits)
 0000:0005BB80 Abs   GeneralArt1 (letters, icons, black BGs, energy ball & missiles)
 0000:0005C74E Abs   GeneralArt2 (bonus ship)
 0000:0005CBD4 Abs   GeneralArt3 (planets, lvl loading screen borders)
 0000:0005DF7A Abs   GeneralArt4
 0000:0005E056 Abs   GeneralArt5

This data bank is introduced by the GeneralArtPtrTable, an offset table, giving access of all subbanks.

Level art

Here is the game's pandora box, the thunder's core : level graphics. This is the most exciting data, because level art can be edited, but a great problem is that this data is compressed using Iwanaga compression.

Each level has a set of three graphics bank : 
    - one containing foreground/background tiles
    - one containing foreground/background tilemap (mappings of the entire level layers)
    - one containing object's sprites (tiles & sprites mappings)

LEVEL ART LOCATIONS
           [OBJECTS ART BANK]        [PLANE A/B TILES BANK]    [PLANE A/B MAPPINGS BANK]
HYDRA               00020122            0002E9C2                000395CF
GORGON              00021CDF            0002FE5E                0003AEC4
SEIREN              00023BB1            00030F73                0003BEAC
HAIDES              00025C54            00031D78                0003D22A
ELLIS               00027E76            00032E38                0003E6B5
CERBERUS            00029DB3            00033DFE                0004028C
ORN_BASE            0002BBEB            00035B21                00041EF6
ORN_CORE            0002D801            000386E3                00043B34
 0000:00020000 Abs   LevelCprData

 0000:00020122 Abs   CprSprites_Hydra
 0000:00021CDF Abs   CprSprites_Gorgon
 0000:00023BB1 Abs   CprSprites_Seiren
 0000:00025C54 Abs   CprSprites_Haides
 0000:00027E76 Abs   CprSprites_Ellis
 0000:00029DB3 Abs   CprSprites_Cerberus
 0000:0002BBEB Abs   CprSprites_OrnBase
 0000:0002D801 Abs   CprSprites_OrnCore
 
 0000:0002E9C2 Abs   CprTiles_Hydra
 0000:0002FE5E Abs   CprTiles_Gorgon
 0000:00030F73 Abs   CprTiles_Seiren
 0000:00031D78 Abs   CprTiles_Haides
 0000:00032E38 Abs   CprTiles_Ellis
 0000:00033DFE Abs   CprTiles_Cerberus
 0000:00035B21 Abs   CprTiles_OrnBase
 0000:000386E3 Abs   CprTiles_OrnCore
 
 0000:000395CF Abs   CprMaps_Hydra
 0000:0003AEC4 Abs   CprMaps_Gorgon
 0000:0003BEAC Abs   CprMaps_Seiren
 0000:0003D22A Abs   CprMaps_Haides
 0000:0003E6B5 Abs   CprMaps_Ellis
 0000:0004028C Abs   CprMaps_Cerberus
 0000:00041EF6 Abs   CprMaps_OrnBase
 0000:00043B34 Abs   CprMaps_OrnCore

All of this locations are introduced by a big offsets table named LevelCprData and located at 0x20000 in the game.

Boss art

Bosses are enormous in this game, so they have their own banks of subsprites. It seems to be compressed using Ohtsuka compression.

BOSS ART RELATED LOCATIONS
 0000:00048100 Abs   BossArtPtrTable

 0000:00048120 Abs   BossArt_Hydra
 0000:0004B3B2 Abs   BossArt_Gorgon
 0000:0004D1D4 Abs   BossArt_Seiren
 0000:0004EC16 Abs   BossArt_Haides
 0000:00050E36 Abs   BossArt_Ellis
 0000:0005362A Abs   BossArt_OrnBase
 0000:00055374 Abs   BossArt_OrnCore

Target Shadows

Target Shadows are special graphics used at the level loading screen to display the shadow of the silhouette wich is supposed to be the monster to kill to beat the level.

Target Shadows LOCATIONS
 0000:0001F400 Abs   TargetShadowPtrTable

 0000:0001F420 Abs   TargetShadow_Hydra
 0000:0001F578 Abs   TargetShadow_Gorgon
 0000:0001F814 Abs   TargetShadow_Seiren
 0000:0001F91C Abs   TargetShadow_Haides
 0000:0001FA2E Abs   TargetShadow_Ellis
 0000:0001FB96 Abs   TargetShadow_Cerberus
 0000:0001FC5A Abs   TargetShadow_OrnBase
 0000:0001FD56 Abs   TargetShadow_OrnCore

It seems this data is compressed using Ohtsuka compression, and it's used to display the target white silhouette on the loading menu.

Story/Credits art

There are some art data used to display pictures at the end of the game, letters for credits, orn base sprite explosions, mappings and tiles of the styx's base.... All of this data is compressed
with Iwanaga, like level art data.

End art related locations
 0000:0006561A Abs   CprEndPtrTable

 0000:00065688 Abs   CprEndStars
 0000:000657CF Abs   CprEndOrnExplosion
 0000:00065A68 Abs   CprEndStoryChars
 0000:00065D3F Abs   CprEndOrn
 0000:00066D97 Abs   CprEndBaseTiles
 0000:0006813C Abs   CprEndPicture1
 0000:00069875 Abs   CprEndPicture2
 0000:0006BFB4 Abs   CprEndBaseMaps
 0000:0006C463 Abs   CprEndStyxAndLetters

The CprEndPtrTable introduces banks and parameters required for decompression. See Iwanaga compression for more informations.



Iwanaga Compression

There are two Iwanaga compression formats in the game, the first is used to compress most level graphics, and other graphics ; the second is used to compress the sega logo art. I cracked one of them : the first which is the most important, I developped a software application to uncompress data from the game, but I didn't create the recompression program. I will discuss here informations about the first compression format called Iwanaga because this name is the name of the main programer of the game, so it's dedicated to its good work. The second format is a small modification of the Iwanaga compression, but I didn't crack it, because modifying Sega logo art really doesn't interest me ! This part of the document needs some good computer science/compression/programming knowledge.

Compression technique used

There is a lot of compression technique, used on games, when you're lucky it can be Run Length Encoding, if you're less lucky you can find some LZ77 or derivation of that : a compression technique with a "Window" (data buffer) used to find strings repeated in a data flow. There is a final format used in old consoles : the Huffman Encoding. Each byte of the data is converted to a path in a tree called the huffman tree. Iwanaga is a combination of Adaptive Huffman & some LZ77 derivated : this method is a brother of the famous DEFLATE compression, used in best compression tools like gzip.

To understand the compression technique, you must understand how LZ77 & Huffman Encoding work separately : you should study it a lot :
Huffman Coding
Adaptive Huffman Coding
LZ77 Algorithm
An explanation of the DEFLATE

Uncompression Technique - The Iwanaga Tree

First, the uncompression program build a huffman tree in a complete binary heap structure. In the exact implementation, I remember the game used two heaps, one storing vertices values, and another storing vertices weights.

This tree structure is composed of vertices (nodes), and leaves, leaves are nodes which have no sons. A leaf is always a son of a vertex, and every vertex have two sons, because the tree is a complete binary tree. The "first" vertex of the tree is the one which have no father : it's called the root. Every vertex which is not a leaf have two sons : the left son, and the right son. Sons are vertices which can be vertices of the tree, or leaves.

The tree is used to save unique symbol repetitions. The Iwanaga Tree is used to store one byte in one leaf, because byte can take 0x00 to 0xFF value.. In its technical implementation all vertices which are not leaves have a value from 0x00 to 0x272. If a vertex has a value greater than 0x272, it's a leaf, and you must do : (Leaf.Value - 0x273) to obtain the value of the byte contained in the leaf. These leaves are used for the Standard Huffman uncompression.

There are leaves which are not describing a byte between 0x00 and 0xFF, these leaves have a value from 0x100 to 0x138. These leaves are used for the second context of uncompression : the LZ method.

Each vertex (leaf or other vertex) of the tree has a weight value, each leaf has a weight equal to 0x01, and a weight of a vertex is the sum of weights of its sons, so the father of a leaf have a weight of 0x02, its father have a weight of 0x04... Weights are used to save the number of times a symbol contained in a leaf has been met in the compressed stream. The tree is updated at each time a weight is modified, so the tree evolves during the uncompression process, this technique is used to make length of tree path smaller for bytes or lz patterns frequently discovered in a data stream. The update of the tree is used to sort vertices so that high weights are upper in the tree than low weights, so frequently discovered symbols are easier to access in the tree : the path has a smaller length !

Uncompression Technique - The Huffman Part

The Huffman coding is always used during the uncompression process, to find a path in the Iwanaga Tree. Compressed stream is read, one bit after another, if the bit is clear, the left son is reached, if the bit is set the right son is reached, and so on... A random bit string in the compressed stream gives a leaf to access. If this leaf contains a standard byte, the byte is written in the uncompressed stream, and in a special array used for the LZ part : the lz buffer (also dcpr buffer ; uncompressed data buffer). Then the weight of leaf reached is incremented, and the tree is updated. If the leaf reached contains a lz pattern (value from 0x100 to 0x138), it will execute the lz context (see next part), increment the leaf reached and update the tree.

The Huffman method is not really difficult to understand, the method used is an adaptive huffman, because of the updated tree with weights. Find these techniques in the game code is the hardest thing is ever do, because bit reading process is really difficult to "see" when you read directly asm code, and the heap was a data structure I didn't know. To read a bits one after another, the game use some multiplications by 2 (similar to shift left logical 1), and read the bit value with masks, and the access to the son of a vertex in the heap is done by index specific multiplications....

When a leaf is reached, the uncompressed stream is modified (some bytes added), and then, another iteration of the algorithm is done with left bits on the compressed stream and from the root of the updated tree. The uncompression process stops when the original uncompressed stream size is reached, this size is given in all compressed banks of data in the game, you can find original sizes in PtrTables, for example in the LevelCprData table, at 0x20000, it's the values : 0x5000, 0x6000 & 0x7000... It really depends on the data contained, in this case this banks contains data directly moved into VRAM for each level so bank sizes are the same for each level.

Uncompression Technique - The LZ Part

Here is the most diffcult part of the uncompression process.
Some informations :
- the lz buffer/lz window/dcpr buffer is the buffer of dcpr data saved during the uncompression (it's a small circular buffer, not the final uncompressed stream).
- the lz pointer is a negative offset in the lz buffer to know where the repated string is encountered in lastly uncompressed bytes.
- the count value is the length of the repeared string in the dcpr buffer

To get the repeated string in the dcpr buffer, we must have the lz pointer, and the count value, these value are obtained after some calculations :

So when a LZ Pattern is reached in a leaf of the Iwanaga Tree, the LZ Context is executed.
The game read, immediately after the bits' path given to reach the lz_pattern, a string of 8 bits (a byte), this byte is used for the followings :
    - it's used to calculate the lz pointer (the pointer used to know where the repated string is in the lz window/dcpr buffer)
    - it's used as an index of two arrays, used to calculate the lz pointer.

The two arrays are special array of 0x100 bytes, hard-written in the game.
The first array is called the DcprTable1, at 0x5D00, the second one is called the DcprTable2, at 0x5E00.

To generate the pointer, the game use the following :

Pointer = Index
Val1 = DcprTable1[Index]
Val2 = DcprTable2[Index]
Val2 = Val2 - 2


Then the program read (Val2) bits from the compressed stream (just after the index read before), each bit is added at the end of the Pointer :

while (read Val2-- bits)
    Pointer <<= 1
    Pointer += Bit


And then, Val1 is added to the Pointer :

Pointer &= 0x3F (6 lowest bits)
Val1 <<= 6 (shift left logical 6 bits)
Pointer = Pointer | Val1


so Val1 contains the highest bits of the pointer.

The final pointer can have a value from 0x0000 to 0x0FFF (that's because the dcpr buffer is a circular buffer with a length of 0x1000).
So we have the pointer, we know where the string begins in the lz buffer, then we must have the length of the string to copy !
The length is simply contained in the lz pattern previously reached in the huffman tree :

Length_Of_String = LZ_Pattern - 0xFD

That's why minimal strings compressed with the LZ context have a length of 0x03, the max string length possible is 0x138 - 0xFD => 0x3B.

Then with these two information, the program copy the string into the uncompressed stream AND the uncompressed buffer (lz window/buffer).
When the copy process is completed, the lz context is stopped, and the program make a new iteration of the huffman process, after an update of the tree.

During any LZ copy, the method check if the original uncompression size is not reached, and totally stop the uncompression if yes.

Compression tools

Cracking this compression is the hardest thing I ever done in computer science, and I converted the uncompression process in a C++ program. All sources are available, the compression process is explained in Tree.cpp, this is not very well written, but it runs, and decompress any compressed bank of Thunder Force 3. The bin directory contains the executable release.

I didn't find any similarity with TF2 & TF4 compression formats, I didn't study these games, I really don't like the TF2, and I think TF4, is really too evolved and have not the same graphical & musical style.

The recompression program does not exist, but it can be done ! If you're really interested in developping tools for this game, especially for level editing, and if you think you're not enough strong to rebuild the compression process, if your utility project is really serious, contact me and I'll see if I could build a recompression tool for your projects (serious only, UML charts, code samples, and screenshots required). Only C++ sources + exec release will be done for this recompression program.

For information, the uncompression subroutine is located at 0x5A7A : Uncompress.

The compression tool I give you must be used in a command line, cpr units can be found in Ptr tables which introduces compressed banks. For level banks, here is cpr units :

OBJECT BANK             = 0x5000
Planes Tiles BANK         = 0x6000
Planes Mappings BANK     = 0x7000


You can use CodeBlocks to open the program's project.

Here is a final present : a SDL program which view a level in a big window, you must uncompress banks to use it, here is a demo of this program (go to the second half part of the video to see art viewing of Hydra & Ellis levels). To view the level, you must extract tiles of the level, then mappings of the level, in two separated files, then you must extract in a palette file, the 4 palettes used in the level. You can find an example of palette file format (ornbase_palette.bin), use raw copies with an hex editor, to make your own palettes files. To understand the level mappings format, I recommend
you to read my source code, I remember there are big block of tiles of 32 pixels (like in Sonic's games), so there are mappings organizing the 32px blocks, and then some mappings organizing position of 32px blocks on level planes, I wrote this program one year ago, so I really don't remember the exact format. You should study the program's code, well organized in objects, you can open it with CodeBlocks.


Ohtsuka compression

Ohtsuka compression is nothing more than a simple Run Length Encoding implementation. The game seems to use 4 subroutines using this compression method, I didn't study each subroutine using it, but only one of them. Here is a document where I rewrote the uncompression process in pseudo-C, I give some informations. A simple decompressor/recompressor can be written easily.

I must thank a guy known as doc mefisto, who helps me to find the compression method. I thought Iwanaga was the only compression method of the game, but it seems to use 2 main compression algorithm : Iwanaga & Ohtsuka.


Level

A level is composed of :
    - the main Level subroutine including level loading, and main game engine located at 0xD4C (Level).
    - a music
    - a foreground : called the plane A / layer A (VDP graphical layer name) (tiles and mappings)
    - a background : called the plane B / layer B (tiles and mappings)
    - objects
    - a boss (loaded at the end)
    - a text description : a name, a supposed target, a weakpoint for the target (displayed at the beginning)

I will discuss here how to edit this data, editing foreground and background mappings require a recompression program for banks, because collisions are included in the layer A mappings. and these mappings must be decoded, I will not discuss it here.

Objects list

Here is a good information for level editing : the game is developped with some object lists, each level has its own object list, it is a list of object instanciation during the level.

Level

Objects list rom address (hex)

Objects list size (hex)

Number of entries (hex (dec))

Number of empty slots (dec)

HYDRA

18000

3D0

7A (122)

262

GORGON

18C00

588

B1 (177)

207

SEIREN

19800

3F8

7F (127)

257

HAIDES

1A400

5F0

BE (190)

194

ELLIS

1B000

3F8

7F (127)

257

CERBERUS

1BC00

428

85 (133)

251

ORN BASE

1C800

5A0

B4 (180)

204

ORN CORE

1D400

020

4 (4)

380


Object lists are composed of entries, which can be of two types :
    - an object instanciation entry : these entries are used to generate an object loading during the level execution
    - a non-object instanciation entry : these entries are used to give some parameters to the level engine

Level object entry format - Object Instanciations

byte 0

 1

2

3

4

5

6

7

Counter value

X

Y

Bitfield

ID

Parameters


When a level is executed, there is a RAM variable which is always incremented, I called it the Lvl Counter. Each time the Lvl Counter is incremented the top of the list is checked to see if an object has the same value, and is loaded, so each entry of the list contains a counter value, and all entries are sorted by increasing counter values (if not, the level turns in an endless loop).
X & Y
are Screen coordinates where the object must be loaded,
the bitfield is used to know in which level difficulty the object must be loaded and for other purposes. 
ID
is the value of the entry when it's greater than or equal to 07, it's an object ID (object instanciation), when it's 00 to 06, it's a level parameter.
Parameters
are used for the object program, it's loaded in the object slot in RAM, and interpreted by the object program, so parameters format depends of the object (the id).

Exemple of an object instanciation : the first bonus ship on Hydra (at 0x18028) :

0x00

0x18

0x38

0x18

0xFF

0x07

0x00

0x07

Object loaded when counter value = 0x18

X position : 0x38 (right)

Y position : 0x18 (top)

Loaded on all difficulty level

ID = 0x07
Object = Bonus ship

7 = bonus ship contains the CLAW weapon


You can modify the second 0x07, to choose what item is given by the bonus ship !

By modifying these lists, I generated a rom hack, where the five firsts levels contains only a boss at the beginning. Here is a demo of this hack.

Level parameters

Level parameters are given by object lists entries with an ID from 0x00 to 0x06.

Lvl Param
00

The 00 parameter is the most powerful lvl param used in that game, and it is the ultimate element which make this game greatly editable : the 00 Call is used to change X & Y Velocities of the scroll. In only one line of 8 byte, in the Object list, you can modify velocities of plane A and plane B. These calls were used in the object list to make the main level scrolling program global and editable for every level. Now you know this game has not his scroll hard coded, you can modify it as a string !

The param 00 object list's entry is not like objects entries, because it's really not the same data... All objects have the same entry format but this element is a configuration command for a program, so it has its own syntax. (x & y velocities for plane A & B).

01

The 01 param is used to display a special background pattern, especially for Haides or Gorgon levels.

02

The 02 param is used as a display configuration. It's used to set priorities between planes. It uses bits, and I do not finish to discover parameters of this entry. In the game, it is used to display plane A in front of the sprite layer, in the HYDRA level for example.

03

Today, I don't find the goal of this param. It seems to be used at level beginnings, but by corruption, I have no results. Theory : used by the level to define parts of the plane (top and bottom parts)...

04

The 04 call is used by the HAIDES specific level, to move bottom part of the plane A.

05Not used, don't try to use it, no asm code is dedicated to recognize this param.
06

The param 06 is used to load the boss (to display the emergency message) and to execute the boss program.
There's always two 06 param entries in a level, they are the two finals entries of the object list. Their counter value must be separated of 0x10. The first 06 execute the emergency object program, flush the object loaded in RAM slots and load boss art. The second 06 entry make the boss program begin.


I made a hack using these informations, to modify the scrolling velocity of Hydra, this is a demo of this hack.

Level playlist

Musics played on levels are organized in a playlist at 0x6D4, the first byte give the music number for Hydra, the second for Gorgon and so on...
Change byte numbers to change musics.

Musics played on bosses are organized in a similar playlist at 0x109E.

Level information

Text used as level information
 0000:000065C4 Abs   HydraTxt
 0000:00006600 Abs   GorgonTxt
 0000:00006645 Abs   SeirenTxt
 0000:0000667F Abs   HaidesTxt
 0000:000066B9 Abs   EllisTxt
 0000:000066FD Abs   CerberusTxt
 0000:0000673C Abs   OrnBaseTxt
 0000:0000677A Abs   OrnCoreTxt


Other data

I will discuss here some other specific data : demo, sound test menu, stage select menu, credits, end story...

Demo

The game plays 5 demos, if it is not started by pushing START on the Title Screen, I didn't crack the demo format, but it must describe styx's movements on levels, levels used on demo are decided by asm code, I think demo banks contain only data used to move the styx.

DEMO RELATED LOCATIONS
 0000:0001E000 Abs   Demo_Hydra
 0000:0001E400 Abs   Demo_Gorgon
 0000:0001E800 Abs   Demo_Seiren
 0000:0001EC00 Abs   Demo_Haides
 0000:0001F000 Abs   Demo_Ellis

Some variables in memory must be found to understand the demo launching process, you must analyze the LoadLevel subroutine (0x576) and the Level main subroutine (0xD4C)

Menus

The option menu is managed by a big subroutine located at 0x67B8 (OptionMenu), around here you will find strings used in this menu.
A small subroutine manage the controller layout (speed change, weapon...) : 0x6D2C, strings used are just after it.

The stage select menu is managed by a big subroutine located at 0x6DB4 (StageSelect), around this location you can find all string data used by the menu.

We can see at 0x6F92 strings used by the game to display level names in the stage select menu, we can see names of all levels of the game, and the Cerberus level is called Keberos in this string list,
that's why I think Kerberos is the beta name of the Cerberus level, and I think the stage select menu was originally designed to choose any level of the game, may be it was used as a debugging feature. We can remake the old menu, probably used by developpers , by a really simple but powerful asm hack, given at the end of this document.

Credits/End

Credits strings are located at 0x63BA5.
The story explained about ORN at the end is located at 0x63780.

I didn't exactly locate subroutines used for the end story and for credits...


Conclusion


Here is my final conclusion : I worked a lot on this game, I cracked the compression format and I set up an IDA disassembly, I don't publish it for personnal reasons, if you really want to make a hack for this game, and if you need to have a disassembly (I recover the asm code, and it can be rebuilt), contact me, if your project is good, it can be interesting.

Here is the game's subroutine list.
Here is the game's memory map, a lot of locations are programs labels, you must use grep or other string analyser to filter this list.

mObj is an engine used to apply modifications on object slots when they are deleted at 0x7408.
Fire is an engine used to manage styx's fires at 0x20AE
StyxSprite is an engine used to manage styx main ship operations at 0x15FA.
littleShip is an engine used to manage operations of small ship object, commonly used on Hydra or Ellis, located at 0xEA76.

Here is a final present for your : a very simple but powerful hack, the one which make the original stage select, to chose any level, it can be useful if you want to hack the game :

BETA STAGE SELECT HACK
only 6 bits modified on the all game to obtain all level playable
from the beginning :

0x6E45 : O4 => 07    (0100 => 0111)
0x6E63 : 04 => 07    (0100 => 0111)
0x6E75 : 04 => 07    ;;

Launch the game : all levels can be played !

Because I decided to make new studies, hard studies, I should not work on this game during two years, but I gave you all my knowledge, you have to work on it yourself now, hacking needs a lot of tests on games, trying to edit it, to corrupt it, and you must learn the 68000 assembly language, because the game's program contains all answers we, rom hackers, look for.